Fast topology bus router for interconnect planning

ABSTRACT

A method includes receiving a netlist for a chip including a bus and determining, by one or more processors and based on the netlist, a first routing topology for the bus and through a routing region of the chip by comparing a demand of the bus to a capacity of a plurality of cells of the routing region. The method also includes generating a layout for the chip based on the first routing topology.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to and the benefit of U.S. Provisional Patent Application Ser. No. 63/038,622, entitled “Fast Topology Bus Router for Interconnect Planning,” filed Jun. 12, 2020, which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

The present disclosure relates to layout design for an integrated circuit. More particularly, the present disclosure relates to layout of interconnect buses for chips having multiple modules.

BACKGROUND

As technology advances, the design size of chips has increased dramatically as more functional modules are incorporated in one chip. Many integrated circuit designs have multiple child design modules that communicate with each other. In a high-speed system-on-chip (SoC) design, top-level interconnect buses, which connect different design modules, can be a dominant factor in determining final chip performance and power usage. In various instances, multiple iterations from register-transfer level (RTL) design to final physical floorplan are completed to meet timing requirements. One of the stages involved in such iterations is top-level interconnect bus planning. Since those top-level interconnect buses can be critical in determining the chip performance and feasibility, a fast and accurate methodology to implement and verify the timing and/or feasibility of those interconnect buses during an iteration is advantageous. However, due to the complexity involved in such planning, in many current implementations, the interconnect buses are manually planned. As the number of interconnect buses increases dramatically, the manual planning process increases in time and/or effort. In many instances, the amount of time required to complete the top-level interconnect bus planning is prohibitive.

SUMMARY

According to an embodiment, a method includes receiving a netlist for a chip comprising a bus and determining, by one or more processors and based on the netlist, a first routing topology for the bus and through a routing region of the chip by comparing a demand of the bus to a capacity of a plurality of cells of the routing region. The method also includes generating a layout for the chip based on the first routing topology.

The method may further include constructing, based on the netlist, a target for the bus. The bus may include a plurality of pins of a source connected to a plurality of pins of the target. Constructing the target may include determining a center pin of the plurality of pins of the source and a center pin of the plurality of pins of the target.

Comparing the demand of the bus to the capacity of the plurality of cells of the routing region may include comparing a first demand of a first portion of the bus to a capacity of a first cell of the plurality of cells and comparing a second demand of a second portion of the bus to a capacity of a second cell of the plurality of cells. The first cell may be adjacent to the second cell in the routing region. The first demand may be greater than the second demand. In the first routing topology, the first portion of the bus may be routed through the first cell and the second portion of the bus is routed through the second cell.

The method may also include determining, for the bus, a plurality of routing topologies through the routing region, the plurality of routing topologies comprising the first routing topology and determining a cost for each of the plurality of routing topologies by comparing a demand of the bus to a capacity of a plurality of cells of the routing region for the respective routing topology. The respective cost for the first routing topology may be lower than the costs for the other routing topologies of the plurality of routing topologies. The method may also include increasing a cost for a second routing topology of the plurality of routing topologies in response to determining that a plurality of cells for the second routing topology includes a blockage. The method may further include comparing a cost of a second routing topology of the plurality of routing topologies with a cost of a third routing topology of the plurality of routing topologies.

The method may also include undoing the first routing topology in response to determining that the first routing topology does not meet at least one of a timing requirement or a topology requirement. The method may further include adjusting the netlist in response to determining that the first routing topology does not meet at least one of the timing requirement or the topology requirement.

The method may also include duplicating the first routing topology for a plurality of bits of the bus.

According to another embodiment, an apparatus includes a memory and a hardware processor communicatively coupled to the memory. The hardware processor receives a netlist for a chip comprising a bus and determines, based on the netlist, a first routing topology for the bus and through a routing region of the chip by comparing a demand of the bus to a capacity of a plurality of cells of the routing region. The hardware processor also generates a layout for the chip based on the first routing topology.

The hardware processor may also construct, based on the netlist, a target for the bus. The bus may include a plurality of pins of a source connected to a plurality of pins of the target. Constructing the target may include determining a center pin of the plurality of pins of the source and a center pin of the plurality of pins of the target.

Comparing the demand of the bus to the capacity of the plurality of cells of the routing region may include comparing a first demand of a first portion of the bus to a capacity of a first cell of the plurality of cells and comparing a second demand of a second portion of the bus to a capacity of a second cell of the plurality of cells. The first cell may be adjacent to the second cell in the routing region. The first demand may be greater than the second demand. In the first routing topology, the first portion of the bus may be routed through the first cell and the second portion of the bus is routed through the second cell.

The hardware processor may also determine, for the bus, a plurality of routing topologies through the routing region, the plurality of routing topologies comprising the first routing topology and determine a cost for each of the plurality of routing topologies by comparing a demand of the bus to a capacity of a plurality of cells of the routing region for the respective routing topology. The respective cost for the first routing topology may be lower than the costs for the other routing topologies of the plurality of routing topologies. The hardware processor may also increase a cost for a second routing topology of the plurality of routing topologies in response to determining that a plurality of cells for the second routing topology includes a blockage.

The hardware processor may also duplicate the first routing topology for a plurality of bits of the bus.

According to another embodiment, a method includes receiving a netlist for a chip comprising a bus and determining, by one or more processors and based on the netlist, a first cost for a first routing topology for the bus and through a routing region of the chip by comparing a demand of the bus to a capacity of a first plurality of cells of the routing region. The method also includes determining, by the one or more processors and based on the netlist, a second cost for a second routing topology for the bus and through the routing region of the chip by comparing the demand of the bus to a capacity of a second plurality of cells of the routing region. The second plurality of cells is different from the first plurality of cells. The method also includes, in response to determining that the first cost is lower than the second cost, generating a layout for the chip based on the first routing topology.

BRIEF DESCRIPTION OF THE DRAWINGS

The disclosure will be understood more fully from the detailed description given below and from the accompanying figures of examples described herein. The figures are used to provide knowledge and understanding of examples described herein and do not limit the scope of the disclosure to these specific examples. Furthermore, the figures are not necessarily drawn to scale.

FIG. 1 illustrates a method of top-level topology planning for interconnect buses in accordance with some examples of the present disclosure.

FIG. 2A illustrates maze wave expansion in global routing in accordance with some examples of the present disclosure.

FIG. 2B illustrates capacity and demand calculation for a global cell in accordance with some examples of the present disclosure.

FIG. 3 illustrates a method of generating a chip layout in accordance with some examples of the present disclosure.

FIGS. 4, 5, and 6 illustrate routing target creation in accordance with some examples of the present disclosure.

FIG. 7 illustrates bus routing results in accordance with some examples of the present disclosure.

FIG. 8 illustrates capacity and demand calculation for a 32-bit bus in accordance with some examples of the present disclosure.

FIGS. 9A and 9B illustrate blockage detection in accordance with some examples of the present disclosure.

FIG. 10 illustrates a method of generating a chip layout in accordance with some examples of the present disclosure.

FIG. 11 depicts a flowchart of various processes used during the design and manufacture of an integrated circuit in accordance with some examples of the present disclosure.

FIG. 12 depicts a diagram of an example computer system in which examples of the present disclosure may operate.

DETAILED DESCRIPTION

Aspects described herein relate to a fast topology bus router for interconnect planning. In the chip design process, to meet topology requirements and timing goals, interconnect bus planning can be a time intensive and, in many instances, a manually intensive process. The interconnect bus planning process may include many iterations, increasing the amount of time required for the planning process. Further, as the number of interconnect buses increases, the interconnect bus planning process increases in time. In many instances, routing bus bits are not kept together when generating the routing topology for interconnect buses. However, since the bus bits have similar connectivity and have a same or similar timing requirement, ideally all the routing bits with the same routing topology would be kept together. Additionally, or alternatively, in many instances, a routing process does not account for the available resources and blockages when generating the routing solution or solutions. In the following, an improved chip design process having an improved routing methodology is described that keeps bus bit routing together while employing a routing runtime that has an increased speed as compared to current methods.

The following describes an improved global routing-based method that can provide a faster routing process and can keep the bus bit routing together. The global routing-based method can consider the available routing resources and can account for routing blockages (congestion), which can permit the global routing-based method to deliver improved quality of results (QoR).

FIG. 1 illustrates a top-level interconnect bus planning flow for a chip design process. The flow may be executed by one or more processors (e.g., one or more of the processing device 1202 of FIG. 12). In one or more embodiments, the flow may be executed by one or more processors of one or more servers. In particular, the method of FIG. 1 includes receiving RTL data (at 110) with top-level interconnect buses. The RTL data may be a netlist. At 112, topology constraints are applied to the RTL data, and at 116, topology plans are generated based on the constraints and RTL data. Generating the topology plans at 116 may include performing an optimization process. The optimization process may involve selecting some improvement (if an improvement is available), within the structure of the algorithm implemented, of some identified characteristic, and does not imply an absolute or globally optimal (as the term is colloquially used) improvement of the characteristic. For example, in some situations where an optimization process may determine a minimum, the minimum may be a local minimum rather than the global minimum.

A decision is made to determine if the generated topology plans meet timing and/or topology requirements (at 118). In response to the results not meeting timing and/or topology requirements, the current results are removed (or undone) at 120, and more iterations are performed to achieve the final timing and/or topology goal. At 114, if an RTL change is required, the process returns to the RTL input stage (at 110). If no RTL change is required, the process returns to add and/or modify topology plan constraints (at 112) such that the timing and/or topology goal can be achieved. Such iterations are repeated until the final timing and/or routing topology requirements are satisfied. At 122, if there are more topology plans to process, additional topology constraints are applied (at 112). If there are no additional topology plans to process, the process is completed (at 124).

The following description focuses on the stage that generates interconnect bus routings to meet a timing and/or topology requirement as shown in 116 of the flowchart of FIG. 1. A topology bus router (e.g., global-routing-based topology bus router) is utilized to derive the routing topology. An interconnect bus (or bus) includes one or more bits. Further, each bit of a bus may correspond to a different wire (or trace) of the bus. In one example, a 32-bit bus has 32 wires. The topology bus router is based on global routing technology that divides the chip into small grids. Within each grid, the available tracks are calculated as capacity on each routing layer.

During global routing, complex design rules are observed and a design is captured in a grid graph. For example, in a grid graph, complex design rules may be shown as routings and pins. During global routing, as illustrated in FIG. 2A, each layer of the entire routing region is partitioned into rectangular regions called global cells (gcells) 210. While performing global routing, the objects (e.g., pins, blockages, and/or cell instances, etc.) within the design are extracted and the available resources (e.g., available wire tracks on each routing layer) are determined. The objects and resources are extracted and converted in each gcell grid (e.g., gcells 210). Each routing segment is extracted and used to determine demand within the corresponding gcells 210. The capacity and/or demand for each gcell grid and the global routing results may be determined at the end of the global routing process. As will be discussed in the following, the capacity and demand information for each gcell 210 is utilized to determine if a design is congested or not after global routing is completed.

The gcells include pins 220 a, 220 b, and routing 230. Element 240 corresponds to the used wire tracks within the gcell 210 a.

The capacity for a gcell is the maximum number of free wire tracks that wires can use to cross the gcell on a particular layer. The demand is the actual number of wires going through the gcell. The overflow is the difference between demand and capacity (i.e., demand minus capacity). A negative overflow number implies that the gcell has free wire track(s) available to allow more routing (demand) to go through this gcell. A positive number implies that the demand is more than capacity and should be avoided during global routing. FIG. 2B illustrates the capacity and demand calculation for the gcell 210 a. The horizontal dashed lines in the gcell 210 a represent the wire tracks. The rectangular region with the vertical bars in the gcell 210 a represents a blockage. The solid horizontal line in the gcell 210 a represents a routing. As seen in this example, two of the wire tracks are blocked and thus, the gcell 210 a has a capacity of ten as ten out of twelve wire tracks are available. The unavailable wire tracks correspond to wire tracks that are already utilized for routing. For example, in the embodiment of FIG. 2B, two of the twelve wire tracks are used for routings. Further, the demand is one as one routing passes through the gcell 210 a.
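As one non-limiting illustration, the capacity, demand, and overflow relationship described above may be sketched as follows. The class name GCell and its field names are hypothetical and are not part of the disclosure; the numbers mirror the example of FIG. 2B (twelve wire tracks, two unavailable, one routing).

```python
from dataclasses import dataclass


@dataclass
class GCell:
    """Hypothetical per-layer bookkeeping for a single global cell."""
    total_tracks: int        # wire tracks crossing the gcell on one layer
    unavailable_tracks: int  # tracks that are blocked or already used
    demand: int              # wires actually routed through the gcell

    @property
    def capacity(self) -> int:
        # Capacity is the number of wire tracks that remain free.
        return self.total_tracks - self.unavailable_tracks

    @property
    def overflow(self) -> int:
        # Negative: free tracks remain. Positive: demand exceeds capacity.
        return self.demand - self.capacity


# Values taken from the FIG. 2B example in the text.
cell = GCell(total_tracks=12, unavailable_tracks=2, demand=1)
print(cell.capacity, cell.overflow)
```

In this sketch, a negative overflow indicates that additional routing may still pass through the gcell.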

In conventional approaches for routing buses to generate a topology, the bus bits (e.g., wires or traces) of a bus are routed separately based on the topology constraints. When routing each bus bit, different bus bits see the gcell capacity and demand differently. As a result, the bus bits cannot all be guaranteed to be routed with the same topology. Accordingly, the routings of the bus nets may split into different topologies and lead to undesired results. Further, runtime is slow as the bus bits (e.g., wires or traces) are routed separately. The overall runtime depends on the number of bus bits.

However, as is presented in FIGS. 3-9 and described in the corresponding description, an improved topology bus routing method is described that can generate the same routing topology for all the bus bits with reduced runtime as compared to previous methods.

The topology bus routing method includes constructing routing targets based on the bus connection. Constructing the routing targets is executed by one or more processors (e.g., one or more of the processor 1202). At 310 of FIG. 3, netlist information is received. For example, the netlist information is provided at the output of the netlist verification step 1120 of FIG. 11. At 320 of FIG. 3, a routing target (e.g., a feature) that traverses all of the bus bit connections is generated. Routing targets based on all the bus bit connections are generated from the target. The target includes a plurality of pins, where each bit of a bus is connected to a different pin of the target. Each bus bit has a group of connected pins. For example, each bus bit is connected between a pin of a source and a pin of a target. Different bus bits of a bus generally have similar connectivity. A center of the connected pins (e.g., a center of gravity) is determined and used to create routing targets for the topology bus router. FIG. 4 illustrates how to create routing targets. The pins 402, 404 illustrate how the routing targets appear. At 330 of FIG. 3, a routing topology for each bit of a bus is generated. A more detailed description of generating a routing topology for each bit of a bus is illustrated in FIGS. 7-9 and described in the corresponding description. At 340 of FIG. 3, a chip layout is generated. For example, generating the chip layout may include any one or more of steps 1124-1140 of FIG. 11.
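As one non-limiting illustration, the determination of a center of the connected pins may be sketched as follows. The function names, the use of an unweighted centroid as the center of gravity, and the pin coordinates are assumptions made for illustration only and are not specified in the disclosure.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def center_of_gravity(pins: List[Point]) -> Point:
    """Unweighted centroid of a group of pin locations (an assumption)."""
    xs = [p[0] for p in pins]
    ys = [p[1] for p in pins]
    return (sum(xs) / len(pins), sum(ys) / len(pins))


def center_pin(pins: List[Point]) -> Point:
    """Pin closest to the centroid, used here as the routing target."""
    cx, cy = center_of_gravity(pins)
    return min(pins, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)


# Hypothetical 5-bit bus: source pins on the left edge, target pins on the right.
source_pins = [(0.0, float(y)) for y in range(5)]
target_pins = [(10.0, float(y)) for y in range(4, 9)]

source_target = center_pin(source_pins)
sink_target = center_pin(target_pins)
print(source_target, sink_target)
```

In this sketch, a single pair of routing targets stands in for all of the bus bits, so that only one route needs to be derived between them.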

In the embodiment of FIG. 4, the bus bits have two pins for each net. For example, more than 90% of nets are 2-pin nets. In one or more embodiments, a respective net has more than two pins. In such instances, the bus is called a branching bus. Further, in such embodiments, the pins are grouped based on the connectivity and the pin locations. In the embodiment of FIG. 5, there are three routing targets 502, 503, and 504 for the bus. A first target pin 503 is created based on the connection of the child block. Two additional target pins 502 and 504 are created based on the top-level block connections. As there are two top-level pins connected for each bus bit, the two target pins 502 and 504 are created, one for each side (edge) of the top-level pin layout. After the target pins 502, 503, and 504 are created, routings for the three target pins 502, 503, and 504 are created. Line 406 in FIG. 4 and line 506 in FIG. 5 correspond to the results of global routing. The results are duplicated to each of the bus bits within a bus to create the actual routings.

In one or more embodiments, one or more bus bits in a bus may have a different connection from other bus bits in the bus. For example, as shown in FIG. 6, two bits are connected to the target pins on the left (e.g., the target pins 602), and two bits are connected to the target pins on the right (e.g., the target pins 604). Each of the bus bits has two pins on the net, but the pin locations are not located together. Accordingly, three target pins are created.

The routing target or targets are connected to derive the routing topology for the whole bus once the routing target or targets are created. The routing topology method may be executed by one or more processors (e.g., one or more of the processor 1202) to derive the whole bus once the routing target or targets are created. Global routing technology may be utilized as the routing engine to perform the routing. In the embodiment of FIG. 7, when maze wave expansion in global routing is performed, the router expands the wave 702 from source to target based on the combined demand for all the bus bits. Maze wave expansion is utilized to determine a routing path within the gcells for a bus.
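As one non-limiting illustration, maze wave expansion from a source gcell to a target gcell may be sketched as a breadth-first (Lee-style) search over a gcell grid. This sketch makes a simplifying assumption that is not part of the disclosure: a gcell is treated as usable only if its own capacity can absorb the entire bus width, whereas the described method spreads the combined demand across adjacent gcells. The function name and the capacity values are hypothetical.

```python
from collections import deque
from typing import List, Optional, Tuple

Cell = Tuple[int, int]  # (row, column) of a gcell


def maze_route(free: List[List[bool]], source: Cell, target: Cell) -> Optional[List[Cell]]:
    """Expand a wave from source until target is reached, then backtrace."""
    rows, cols = len(free), len(free[0])
    came_from = {source: None}
    frontier = deque([source])
    while frontier:
        cell = frontier.popleft()
        if cell == target:
            break
        r, c = cell
        for nxt in ((r, c + 1), (r + 1, c), (r, c - 1), (r - 1, c)):
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and free[nr][nc] and nxt not in came_from:
                came_from[nxt] = cell
                frontier.append(nxt)
    if target not in came_from:
        return None  # the wave never reached the target
    path = []
    cell = target
    while cell is not None:
        path.append(cell)
        cell = came_from[cell]
    return path[::-1]


# Hypothetical per-gcell capacities; the two middle gcells are congested.
capacity = [
    [8, 8, 8, 8],
    [8, 2, 2, 8],
    [8, 8, 8, 8],
]
bus_width = 4
free = [[cap >= bus_width for cap in row] for row in capacity]
path = maze_route(free, source=(1, 0), target=(1, 3))
print(path)
```

Because the wave is expanded once for the combined demand of the bus, the resulting path detours around the congested gcells as a unit rather than bit by bit.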

When estimating the demand for the bus bits, the demand calculation considers the array of the gcells together based on bus width. One or more processors (e.g., one or more of the processor 1202) estimate the demand for the bus bits. FIG. 8 shows how the demand is calculated when the maze wave expands from the left gcell array to the right gcell array for horizontal routing. Maze wave expansion may be executed by one or more processors (e.g., one or more of the processor 1202). Vertical routing is similar to horizontal routing. In various instances, maze wave expansion includes horizontal and/or vertical routing within the gcell array. As illustrated in FIG. 8, when performing maze wave expansion horizontally, the process starts from the center gcell (e.g., gcell 802) and expands to upper/lower gcells (e.g., gcells 804 and 806) repeatedly until the number of available tracks in those gcells reaches the number of bus bits. In one embodiment, a first portion of the demand for a bus is compared to the capacity of a first gcell (e.g., gcell 802) and a second portion of the demand for the bus is compared to the capacity of a second gcell (e.g., the gcell 804 or 806). The first portion may be greater than, less than, or equal to the second portion. As illustrated in FIG. 8, the capacity of each of the gcells 802, 804, and 806 meets or exceeds the corresponding demand. Accordingly, maze wave expansion includes gcells 802, 804, and 806 when generating the routing path. For a 32-bit bus, the total demand is 32, which is divided across gcells (e.g., the gcells 802, 804, and 806). The demand of the middle gcell (e.g., the gcell 802) is greater than the demand of the adjacent gcells (e.g., the gcells 804 and 806). For example, the capacity of the gcell 802 (e.g., the middle gcell) is 12 while the demand is 12, and the capacity of each of the gcells 804 and 806 is 12 while the demand is 10.
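As one non-limiting illustration, the division of bus demand across a column of gcells may be sketched as follows. The function name and the policy of filling the center gcell first and then splitting the remainder evenly between the upper and lower neighbors are assumptions chosen to reproduce the FIG. 8 numbers (32 bits divided as 10, 12, and 10); the disclosure does not specify the exact split.

```python
from typing import List, Optional


def spread_demand(bus_width: int, capacities: List[int], center: int) -> Optional[List[int]]:
    """Assign bus demand to a column of gcells, expanding outward from center.

    Returns the per-gcell demand, or None if the column cannot hold the bus.
    """
    demand = [0] * len(capacities)
    demand[center] = min(bus_width, capacities[center])
    remaining = bus_width - demand[center]
    offset = 1
    while remaining > 0:
        ring = [i for i in (center - offset, center + offset) if 0 <= i < len(capacities)]
        if not ring:
            return None  # ran out of gcells before placing every bit
        share = -(-remaining // len(ring))  # ceiling division: even split
        for i in ring:
            take = min(share, capacities[i], remaining)
            demand[i] = take
            remaining -= take
        for i in ring:  # top up with any leftover the other side could not hold
            extra = min(capacities[i] - demand[i], remaining)
            demand[i] += extra
            remaining -= extra
        offset += 1
    return demand


# FIG. 8 example: three gcells of capacity 12 each, 32-bit bus, center index 1.
demand = spread_demand(bus_width=32, capacities=[12, 12, 12], center=1)
print(demand)
```

In this sketch, a return value of None corresponds to a gcell array whose combined capacity cannot meet the demand of the bus.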

In embodiments where the bus is a ripping bus, as illustrated in FIG. 5, the demand is derived based on the connected target pins as not all the bus bits connect to the same target pin locations. Line 506 corresponds to the results of global routing. The results are duplicated to each of the bus bits within a bus to create the actual routings. As the bus in FIG. 5 is a ripping bus, the routing segments from the line 506 are selected based on the corresponding pins. For example, the bus bits of the bus connecting to the terminals on the source select the left routing segments from the line 506.

When performing wave expansion, to determine if a blockage of the expansion exists, the gcells of the array (e.g., the array 800) are checked together for a possible blockage. When one or more of the gcells of a gcell array is blocked, routing through that gcell array is initially prevented. As illustrated in FIG. 9A, the route through adjacent gcells 902A, 902B, and 902C is being evaluated. The gcell 902A of the gcell array 900 is blocked by blockage 903. The blocked gcell or gcells refer to gcells that have a demand higher than a corresponding capacity. Arrays having a blocked gcell or gcells may be utilized when another valid routing path cannot be identified. In such embodiments, utilizing a routing path having a blocked gcell has an increased cost as compared to a routing path that does not include blocked gcells. For example, a second iteration may attempt to route the bus through gcells 902B, 902C, and a gcell 902D immediately below gcell 902C. This path may introduce a turn into the bus. If the gcell 902D is not blocked, then the second path will have a lower cost than the first path through gcells 902A, 902B, and 902C. As a result, the second path will be selected for the routing topology over the first path.

In the example of FIG. 9B, a path through the gcell array 904 is being considered. Specifically, the path routes the bus through adjacent gcells 906A, 906B, and 906C of the gcell array 904. Each of the gcells 906A, 906B, and 906C includes a partial blockage 907. Even if the gcells 906A, 906B, and 906C have enough capacity to meet the demand of the bus, the wires or bits of the bus would still be separated when being routed through the gcells 906A, 906B, and 906C. As a result, such a route would be assigned a cost that is higher than another route in which the wires or bits of the bus are not split or separated during routing. Additionally, the cost for this route would be lower than the route in the example of FIG. 9A in which the entire gcell 902A was blocked.

In some embodiments, several paths are determined through a gcell array and the costs for each path are determined. The cost for a path reflects how well the capacity of the gcells in that path meets the demand of the bus. An ideal path would be a path in which the bus is routed through gcells such that there are a minimal number of blockages and a minimal number of turns in the bus. The lowest possible cost would be for a path that has no blockages and that is completely straight. The path with the lowest cost is then selected as the routing topology for the bus.

FIG. 10 is a flowchart of an example method of selecting a path as the routing topology for a bus. In certain embodiments, the steps of the method are performed as part of 330 in FIG. 3. In 1010, a first cost for a first routing topology is determined. The first cost may depend on several factors, such as (1) the number of blockages in the first routing topology that reduce the capacity of the gcells in the first routing topology, (2) the number of turns in the first routing topology, and/or (3) the number of times the wires or bits of a bus split or separate in the first routing topology. Using the example of FIG. 9A, a cost for the routing topology through the gcells 902B, 902C, and 902D may be determined. Although there may be no blockages in this routing topology, there may still be a cost due to the second factor (e.g., the routing topology may have to turn downwards and then to the right to route through the gcells 902B, 902C, and 902D). In 1020, a second cost for a second routing topology is determined. The second cost may depend on the same factors for the second routing topology as the first cost does for the first routing topology. Using the example of FIG. 9A, a high cost for the routing topology through the gcells 902A, 902B, and 902C may be determined because of the first factor (e.g., the blockage 903 reduces the capacity of the gcell 902A or the gcells 902A, 902B, and 902C below the demand for the gcell 902A or the gcells 902A, 902B, and 902C).

In 1030, it is determined that the first cost is lower than the second cost. In response, in 1040, the first routing topology is selected for the bus. The first routing topology may then be used to generate a chip layout.
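As one non-limiting illustration, the cost comparison of 1010 through 1040 may be sketched as follows. The weighted-sum form of the cost and the weight values are assumptions made for illustration; the disclosure states only that blockages, turns, and splits may contribute to the cost, that a fully blocked gcell costs more than a split, and that a split costs more than an unsplit route. The class and function names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights, ordered so that blockage > split > turn.
BLOCKAGE_WEIGHT = 100
SPLIT_WEIGHT = 10
TURN_WEIGHT = 1


@dataclass
class Topology:
    name: str
    blocked_gcells: int  # gcells whose capacity is below the bus demand
    turns: int           # number of turns in the bus
    splits: int          # times the bus bits split or separate


def cost(t: Topology) -> int:
    """Assumed weighted sum of the three factors named in the text."""
    return (BLOCKAGE_WEIGHT * t.blocked_gcells
            + SPLIT_WEIGHT * t.splits
            + TURN_WEIGHT * t.turns)


# FIG. 9A example: the detour turns downwards and then to the right,
# while the straight route crosses the blocked gcell 902A.
first = Topology("902B-902C-902D", blocked_gcells=0, turns=2, splits=0)
second = Topology("902A-902B-902C", blocked_gcells=1, turns=0, splits=0)

selected = min((first, second), key=cost)
print(selected.name, cost(first), cost(second))
```

In this sketch, the detour through the gcells 902B, 902C, and 902D is selected because its turn cost is lower than the blockage cost of the straight route.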

In some embodiments, a routing topology is determined iteratively. Using the example of FIG. 9A, at a first step, a left to right routing is attempted through gcells 902A, 902B, and 902C. When the blockage in gcell 902A is determined, a high cost is assigned to this first step. At a second step, a left to right routing is attempted through gcells 902B, 902C, and a gcell immediately below gcell 902C. A cost for that step is determined. This process continues until several possibilities for the left to right routing are considered. The step with the lowest cost is selected. The next step in the routing path is then determined using the previously selected step as a starting point. For example, if the step that routed the bus through gcells 902B, 902C, and the gcell immediately below gcell 902C is selected, then the next step may be another left to right routing through the gcells that are immediately to the right of the gcells 902B, 902C, and the gcell immediately below gcell 902C. Several different pathing options may be evaluated and the path with the lowest cost is selected as the next step. After the bus is routed to the target, the total cost of the path is determined. The process may then return to a previous step in the iterations to determine another path to the target and its associated cost. After several paths are determined, the path with the lowest cost is selected as the routing topology for the bus.

When the routing is completed, the routing topology can be duplicated to all the bus bits to generate the actual detail routing topology. For example, a routing topology is generated using the above-described method for a first bit of a bus, and the routing topology may then be used to generate a routing topology of a second bit of the bus. In various embodiments, as the routing is performed once, the runtime is faster as compared to methods that route all the bus bits one by one. A comparison of runtime and QoR between the above-described global-routing-based bus router and conventional global routing methods applied to all the bus bits indicates that the above-described bus router is about 2 to about 200 times as fast as conventional methods, depending on the number of bits in a bus.
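As one non-limiting illustration, duplicating a single routing topology to all the bus bits may be sketched as follows. The representation of a topology as a list of gcell coordinates and the function name are assumptions made for illustration; detailed track assignment within each gcell is not shown.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, column) of a gcell


def duplicate_topology(path: List[Cell], bus_width: int) -> Dict[int, List[Cell]]:
    """Give every bus bit its own copy of the one shared gcell path."""
    return {bit: list(path) for bit in range(bus_width)}


# Hypothetical topology found by routing once for the whole bus.
shared_path = [(1, 0), (1, 1), (1, 2), (1, 3)]
per_bit = duplicate_topology(shared_path, bus_width=32)
print(len(per_bit), per_bit[0] == per_bit[31])
```

In this sketch, the routing search is performed once and the copy step is linear in the number of bus bits, which is consistent with the runtime benefit described above.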

FIG. 11 illustrates an example set of processes 1100 used during the design, verification, and fabrication of an integrated circuit on a semiconductor die to transform and verify design data and instructions that represent the integrated circuit. Each of these processes can be structured and enabled as multiple modules or operations. The term “EDA” signifies Electronic Design Automation. These processes start, at block 1110, with the creation of a product idea with information supplied by a designer, information that is transformed to create an integrated circuit that uses a set of EDA processes, at block 1112. When the design is finalized, the design is taped-out, at block 1134, which is when artwork (e.g., geometric patterns) for the integrated circuit is sent to a fabrication facility to manufacture the mask set, which is then used to manufacture the integrated circuit. After tape-out, at block 1136, the integrated circuit is fabricated on a semiconductor die, and at block 1138, packaging and assembly processes are performed to produce, at block 1140, the finished integrated circuit (oftentimes, also referred to as “chip” or “integrated circuit chip”).

Specifications for a circuit or electronic structure may range from low-level transistor material layouts to high-level description languages. A high level of representation may be used to design circuits and systems, using a hardware description language (HDL) such as VHDL, Verilog, SystemVerilog, SystemC, MyHDL or OpenVera. The HDL description can be transformed to a logic-level register transfer level (RTL) description, a gate-level description, a layout-level description, or a mask-level description. Each lower representation level that is a more detailed description adds more useful detail into the design description, such as, for example, more details for the modules that include the description. The lower levels of representation that are more detailed descriptions can be generated by a computer, derived from a design library, or created by another design automation process. An example of a specification language at a lower level of representation for specifying more detailed descriptions is SPICE, which is used for detailed descriptions of circuits with many analog components. Descriptions at each level of representation are enabled for use by the corresponding tools of that layer (e.g., a formal verification tool). A design process may use a sequence depicted in FIG. 11. The processes described may be enabled by EDA products (or tools).

During system design, at block 1114, functionality of an integrated circuit to be manufactured is specified. The design may be optimized for desired characteristics such as power consumption, performance, area (physical and/or lines of code), and reduction of costs, etc. Partitioning of the design into different types of modules or components can occur at this stage.

During logic design and functional verification, at block 1116, modules or components in the circuit are specified in one or more description languages and the specification is checked for functional accuracy. For example, the components of the circuit may be verified to generate outputs that match the requirements of the specification of the circuit or system being designed. Functional verification may use simulators and other programs such as testbench generators, static HDL checkers, and formal verifiers. In some examples, special systems of components, referred to as emulators or prototyping systems, are used to speed up the functional verification.

During synthesis and design for test, at block 1118, HDL code is transformed to a netlist. In some examples, a netlist may be a graph structure where edges of the graph structure represent components of a circuit and where the nodes of the graph structure represent how the components are interconnected. Both the HDL code and the netlist are hierarchical articles of manufacture that can be used by an EDA product to verify that the integrated circuit, when manufactured, performs according to the specified design. The netlist can be optimized for a target semiconductor manufacturing technology. Additionally, the finished integrated circuit may be tested to verify that the integrated circuit satisfies the requirements of the specification.
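
The graph view of a netlist described above can be illustrated with a minimal sketch. It assumes two-terminal components only, so that each component is a single edge between two net nodes; the class name `Netlist`, its methods, and the component and net names are hypothetical and are not part of any EDA product.

```python
from collections import defaultdict


class Netlist:
    """Toy netlist graph: nodes are nets, edges are two-terminal components.

    Illustrative assumption only; real netlists are hierarchical and
    contain multi-terminal components.
    """

    def __init__(self):
        # net name -> list of (component name, net on the other terminal)
        self.adjacency = defaultdict(list)

    def add_component(self, name, net_a, net_b):
        # A component is an edge joining the two nets it is attached to.
        self.adjacency[net_a].append((name, net_b))
        self.adjacency[net_b].append((name, net_a))

    def components_on(self, net):
        # All components that touch the given net, in sorted order.
        return sorted(name for name, _ in self.adjacency[net])


netlist = Netlist()
netlist.add_component("R1", "vdd", "out")
netlist.add_component("R2", "out", "gnd")
netlist.add_component("C1", "out", "gnd")
shared = netlist.components_on("out")
```

In this sketch, the node `out` records that R1, R2, and C1 are interconnected, which is the sense in which nodes represent how the components are connected.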

During netlist verification, at block 1120, the netlist is checked for compliance with timing constraints and for correspondence with the HDL code. During design planning, at block 1122, an overall floor plan for the integrated circuit is constructed and analyzed for timing and top-level routing. In one or more embodiments, the method described above with regard to FIGS. 1-9 may be performed as part of design planning at block 1122.
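
As a rough illustration of how a routing topology for a bus might be selected during design planning by comparing the demand of the bus to the capacity of cells of a routing region, consider the following sketch. The cost function (one unit per cell, plus any overflow of demand over capacity, plus a fixed penalty for a blockage), the value of `BLOCKAGE_PENALTY`, the grid contents, and all names are assumptions chosen for illustration; they are not the cost model of the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    capacity: int          # routing tracks available in the cell (assumed unit)
    blocked: bool = False  # True if the cell contains a blockage


BLOCKAGE_PENALTY = 1000  # assumed value, chosen only to dominate other terms


def topology_cost(path, grid, demand):
    """Cost of routing a bus of width `demand` through the cells on `path`."""
    cost = 0
    for coord in path:
        cell = grid[coord]
        cost += 1                               # length: one unit per cell
        cost += max(0, demand - cell.capacity)  # overflow of demand over capacity
        if cell.blocked:
            cost += BLOCKAGE_PENALTY            # raise the cost at a blockage
    return cost


def pick_topology(candidates, grid, demand):
    """Return the candidate topology with the lowest cost."""
    return min(candidates, key=lambda path: topology_cost(path, grid, demand))


# A 3x2 routing region in which the middle cell of the bottom row is congested.
grid = {
    (0, 0): Cell(capacity=64), (1, 0): Cell(capacity=16), (2, 0): Cell(capacity=64),
    (0, 1): Cell(capacity=64), (1, 1): Cell(capacity=64), (2, 1): Cell(capacity=64),
}
direct = [(0, 0), (1, 0), (2, 0)]
detour = [(0, 0), (0, 1), (1, 1), (2, 1), (2, 0)]
best = pick_topology([direct, detour], grid, demand=32)
```

Under these assumed numbers, the direct path overflows the congested cell by 16 tracks, so the longer detour has the lower cost and is selected. Marking a cell on the detour as blocked would reverse the choice.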

During layout or physical implementation, at block 1124, physical placement (positioning of circuit components, such as transistors or capacitors) and routing (connection of the circuit components by multiple conductors) occurs, and the selection of cells from a library to enable specific logic functions can be performed. As used herein, the term “cell” may specify a set of transistors, other components, and interconnections that provides a Boolean logic function (e.g., AND, OR, NOT, XOR) or a storage function (such as a flip-flop or latch). As used herein, a circuit “block” may refer to two or more cells. Both a cell and a circuit block can be referred to as a module or component and are enabled as both physical structures and in simulations. Parameters are specified for selected cells (based on standard cells) such as size and made accessible in a database for use by EDA products.

During analysis and extraction, at block 1126, the circuit function is verified at the layout level, which permits refinement of the layout design. During physical verification, at block 1128, the layout design is checked to ensure that manufacturing constraints are correct, such as design rule check (DRC) constraints, electrical constraints, lithographic constraints, and that circuitry function matches the HDL design specification. During resolution enhancement, at block 1130, the geometry of the layout is transformed to improve how the circuit design is manufactured.

During tape-out, data is created to be used (after lithographic enhancements are applied if appropriate) for production of lithography masks. During mask data preparation, at block 1132, the tape-out data is used to produce lithography masks that are used to produce finished integrated circuits.

A storage subsystem of a computer system (such as computer system 1200 of FIG. 12) may be used to store the programs and data structures that are used by some or all of the EDA products described herein, and products used for development of cells for the library and for physical and logical design that use the library.

FIG. 12 illustrates an example of a computer system 1200 within which a set of instructions, for causing the computer system to perform any one or more of the methodologies discussed herein, may be executed. In some implementations, the computer system may be connected (e.g., networked) to other machines or computer systems in a local area network (LAN), an intranet, an extranet, and/or the Internet. The computer system may operate in the capacity of a server or a client computer system in a client-server network environment, as a peer computer system in a peer-to-peer (or distributed) network environment, or as a server or a client computer system in a cloud computing infrastructure or environment.

The computer system may be a personal computer (PC), a tablet PC, a set-top box (STB), a personal digital assistant (PDA), a cellular telephone, a web appliance, a server, a network router, a switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that computer system. Further, while a single computer system is illustrated, the term computer system shall also be taken to include any collection of computer systems that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.

The example computer system 1200 includes a processing device 1202, a main memory 1204 (e.g., read-only memory (ROM), flash memory, dynamic random access memory (DRAM) such as synchronous DRAM (SDRAM)), a static memory 1206 (e.g., flash memory, static random access memory (SRAM), etc.), and a data storage device 1218, which communicate with each other via a bus 1230. The main memory 1204 includes or is a non-transitory computer readable medium. The main memory 1204 (e.g., a non-transitory computer readable medium) can store one or more sets of instructions 1226, that when executed by the processing device 1202, cause the processing device 1202 to perform some or all of the operations, steps, methods, and processes described herein.

Processing device 1202 represents one or more processors such as a microprocessor, a central processing unit, or the like. More particularly, the processing device 1202 may be or include a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a processor implementing other instruction sets, or processor(s) implementing a combination of instruction sets. Processing device 1202 may also be one or more special-purpose processing devices such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), a network processor, or the like. The processing device 1202 may be configured to execute instructions 1226 for performing some or all of the operations, steps, methods, and processes described herein.

The computer system 1200 may further include a network interface device 1208 to communicate over the network 1220. The computer system 1200 also may include a video display unit 1210 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)), an alphanumeric input device 1212 (e.g., a keyboard), a cursor control device 1214 (e.g., a mouse), a graphics processing unit 1222, a signal generation device 1216 (e.g., a speaker), a video processing unit 1228, and an audio processing unit 1232.

The data storage device 1218 may include a machine-readable storage medium 1224 (e.g., a non-transitory computer-readable medium) on which is stored one or more sets of instructions 1226 or software embodying any one or more of the methodologies or functions described herein. The instructions 1226 may also reside, completely or at least partially, within the main memory 1204 and/or within the processing device 1202 during execution thereof by the computer system 1200, the main memory 1204 and the processing device 1202 also including machine-readable storage media.

In some implementations, the instructions 1226 include instructions to implement functionality described above. While the machine-readable storage medium 1224 is shown in an example implementation to be a single medium, the term “machine-readable storage medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term “machine-readable storage medium” shall also be taken to include any medium that is capable of storing or encoding a set of instructions for execution by the computer system and that cause the computer system and the processing device 1202 to perform any one or more of the methodologies described above. The term “machine-readable storage medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical media, and magnetic media.

Some portions of the preceding detailed descriptions have been presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the ways used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm may be a sequence of operations leading to a desired result. The operations are those requiring physical manipulations of physical quantities. Such quantities may take the form of electrical or magnetic signals capable of being stored, combined, compared, and otherwise manipulated. Such signals may be referred to as bits, values, elements, symbols, characters, terms, numbers, or the like.

It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the present disclosure, it is appreciated that throughout the description, certain terms refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage devices.

The present disclosure also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the intended purposes, or it may include a computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.

The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various other systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct a more specialized apparatus to perform the method. In addition, the present disclosure is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the disclosure as described herein.

The present disclosure may be provided as a computer program product, or software, that may include a machine-readable medium having stored thereon instructions, which may be used to program a computer system (or other electronic devices) to perform a process according to the present disclosure. A machine-readable medium includes any mechanism for storing information in a form readable by a machine (e.g., a computer). For example, a machine-readable (e.g., computer-readable) medium includes a machine (e.g., a computer) readable storage medium such as a read only memory (ROM), random access memory (RAM), magnetic disk storage media, optical storage media, flash memory devices, etc.

In the foregoing disclosure, implementations of the disclosure have been described with reference to specific example implementations thereof. It will be evident that various modifications may be made thereto without departing from the broader scope of implementations of the disclosure as set forth in the following claims. Where the disclosure refers to some elements in the singular tense, more than one element can be depicted in the figures and like elements are labeled with like numerals. The disclosure and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.

What is claimed is:
1. A method comprising: receiving a netlist for a chip comprising a bus; determining, by one or more processors and based on the netlist, a first routing topology for the bus and through a routing region of the chip by comparing a demand of the bus to a capacity of a plurality of cells of the routing region; and generating a layout for the chip based on the first routing topology.
2. The method of claim 1, further comprising constructing, based on the netlist, a target for the bus, wherein the bus comprises a plurality of pins of a source connected to a plurality of pins of the target, and wherein constructing the target comprises determining a center pin of the plurality of pins of the source and a center pin of the plurality of pins of the target.
3. The method of claim 1, wherein comparing the demand of the bus to the capacity of the plurality of cells of the routing region comprises comparing a first demand of a first portion of the bus to a capacity of a first cell of the plurality of cells and comparing a second demand of a second portion of the bus to a capacity of a second cell of the plurality of cells, wherein the first cell is adjacent to the second cell in the routing region.
4. The method of claim 3, wherein the first demand is greater than the second demand.
5. The method of claim 3, wherein, in the first routing topology, the first portion of the bus is routed through the first cell and the second portion of the bus is routed through the second cell.
6. The method of claim 1, further comprising: determining, for the bus, a plurality of routing topologies through the routing region, the plurality of routing topologies comprising the first routing topology; and determining a cost for each of the plurality of routing topologies by comparing a demand of the bus to a capacity of a plurality of cells of the routing region for the respective routing topology, wherein the respective cost for the first routing topology is lower than the costs for the other routing topologies of the plurality of routing topologies.
7. The method of claim 6, further comprising increasing a cost for a second routing topology of the plurality of routing topologies in response to determining that a plurality of cells for the second routing topology comprises a blockage.
8. The method of claim 6, further comprising comparing a cost of a second routing topology of the plurality of routing topologies with a cost of a third routing topology of the plurality of routing topologies.
9. The method of claim 1, further comprising undoing the first routing topology in response to determining that the first routing topology does not meet at least one of a timing requirement or a topology requirement.
10. The method of claim 9, further comprising adjusting the netlist in response to determining that the first routing topology does not meet at least one of the timing requirement or the topology requirement.
11. The method of claim 1, further comprising duplicating the first routing topology for a plurality of bits of the bus.
12. An apparatus comprising: a memory; and a hardware processor communicatively coupled to the memory, the hardware processor configured to: receive a netlist for a chip comprising a bus; determine, based on the netlist, a first routing topology for the bus and through a routing region of the chip by comparing a demand of the bus to a capacity of a plurality of cells of the routing region; and generate a layout for the chip based on the first routing topology.
13. The apparatus of claim 12, the hardware processor further configured to construct, based on the netlist, a target for the bus, wherein the bus comprises a plurality of pins of a source connected to a plurality of pins of the target, and wherein constructing the target comprises determining a center pin of the plurality of pins of the source and a center pin of the plurality of pins of the target.
14. The apparatus of claim 12, wherein comparing the demand of the bus to the capacity of the plurality of cells of the routing region comprises comparing a first demand of a first portion of the bus to a capacity of a first cell of the plurality of cells and comparing a second demand of a second portion of the bus to a capacity of a second cell of the plurality of cells, wherein the first cell is adjacent to the second cell in the routing region.
15. The apparatus of claim 14, wherein the first demand is greater than the second demand.
16. The apparatus of claim 14, wherein, in the first routing topology, the first portion of the bus is routed through the first cell and the second portion of the bus is routed through the second cell.
17. The apparatus of claim 12, the hardware processor further configured to: determine, for the bus, a plurality of routing topologies through the routing region, the plurality of routing topologies comprising the first routing topology; and determine a cost for each of the plurality of routing topologies by comparing a demand of the bus to a capacity of a plurality of cells of the routing region for the respective routing topology, wherein the respective cost for the first routing topology is lower than the costs for the other routing topologies of the plurality of routing topologies.
18. The apparatus of claim 17, the hardware processor further configured to increase a cost for a second routing topology of the plurality of routing topologies in response to determining that a plurality of cells for the second routing topology comprises a blockage.
19. The apparatus of claim 12, the hardware processor further configured to duplicate the first routing topology for a plurality of bits of the bus.
20. A method comprising: receiving a netlist for a chip comprising a bus; determining, by one or more processors and based on the netlist, a first cost for a first routing topology for the bus and through a routing region of the chip by comparing a demand of the bus to a capacity of a first plurality of cells of the routing region; determining, by the one or more processors and based on the netlist, a second cost for a second routing topology for the bus and through the routing region of the chip by comparing the demand of the bus to a capacity of a second plurality of cells of the routing region, wherein the second plurality of cells is different from the first plurality of cells; and in response to determining that the first cost is lower than the second cost, generating a layout for the chip based on the first routing topology.